ABSTRACT



A method for tracking a predetermined, two-dimensional 
portion of an image throughout a sequence of images, the 
method comprising the steps of selecting a reference frame; 
selecting the predetermined, two-dimensional portion within 
the reference frame by choosing a polygon that defines the 
boundary of the predetermined portion; fitting a reference 
mesh having at least three corner nodes and an inside node 
to the reference polygon; tracking the reference polygon in 
subsequent or previous image frames by tracking the corner 
nodes; mapping the reference mesh into the tracked polygon 
in the subsequent or previous image frames; and refining 
locations of the inside and corner nodes in the subsequent or 
previous image frames for tracking local and global motion 
of the predetermined portion. 

13 Claims, 12 Drawing Sheets 
METHOD FOR REGION TRACKING IN AN IMAGE SEQUENCE USING A TWO-DIMENSIONAL MESH

FIELD OF INVENTION

The present invention is related to the field of digital image processing and analysis and, more specifically, to a technique for tracking a two-dimensional portion of an image, feature of an image, or a particular object for two-dimensional images that are sequentially placed in chronological order for display.

BACKGROUND OF THE INVENTION

In a wide variety of image sequence processing and analysis tasks, there is a great need for an accurate method for tracking the intensity and motion of a portion of an image throughout an image sequence. This portion, called the reference region hereinafter, may correspond to a particular object or a portion of an object in the scene.

Tracking the boundary of an object has been discussed in M. Kass, A. Witkin, and D. Terzopoulos, "Snakes: Active Contour Models", International Journal of Computer Vision, volume 1, no. 4, pp. 321-331, 1988; F. Leymarie and M. Levine, "Tracking Deformable Objects in The Plane Using An Active Contour Model", IEEE Transactions Pattern Analysis and Machine Intelligence, volume 15, pp. 617-634, June 1993; K. Fujimura, N. Yokoya, and K. Yamamoto, "Motion Tracking of Deformable Objects By Active Contour Models Using Multiscale Dynamic Programming", Journal of Visual Communication and Image Representation, vol. 4, pp. 382-391, December 1993; B. Bascle, et al., "Tracking Complex Primitives in An Image Sequence", in IEEE International Conference Pattern Recognition, pp. 426-431, October 1994, Israel; F. G. Meyer and P. Bouthemy, "Region-Based Tracking Using Affine Motion Models in Long Image Sequences", CVGIP: Image Understanding, volume 60, pp. 119-140, September 1994, all of which are herein incorporated by reference. The methods disclosed therein, however, do not address the tracking of the local deformations within the boundary of the object.

Methods for tracking local deformations of an entire frame using a 2-D mesh structure are disclosed in J. Niewglowski, T. Campbell, and P. Haavisto, "A Novel Video Coding Scheme Based on Temporal Prediction Using Digital Image Warping", IEEE Transactions Consumer Electronics, volume 39, pp. 141-150, August 1993; Y. Nakaya and H. Harashima, "Motion Compensation Based on Spatial Transformations", IEEE Transactions Circuits and System Video Technology, volume 4, pp. 339-357, June 1994; M. Dudon, O. Avaro, and G. Eud, "Object-Oriented Motion Estimation", in Picture Coding Symposium, pp. 284-287, September 1994, CA; C.-L. Huang and C.-Y. Hsu, "A New Motion Compensation Method for Image Sequence Coding Using Hierarchical Grid Interpolation", IEEE Transactions Circuits and System Video Technology, volume 4, pp. 42-52, February 1994, all of which are herein incorporated by reference. However, these methods always include the whole frame as the object of interest. They do not address the problem of tracking an individual object boundary within the frame.

U.S. Pat. No. 5,280,530, which is herein incorporated by reference, discusses a method for tracking an object within a frame. This method employs a single spatial transformation (in this case affine transformation) to represent the motion of an object. It forms a template of the object, divides the template into sub-templates, and estimates the individual displacement of each sub-template. The parameters of the affine transformation are found from the displacement information of the sub-templates. Although this method employs local displacement information, it does so only to find a global affine transformation for representing the motion of the entire object. Therefore, while it tracks the global motion of an object, it cannot track any deformations that occur within the object (i.e., local deformations).

Although the presently known and utilized methods are satisfactory, they are not without drawbacks. In addition to the above-described drawbacks, they also do not take into account the effects of frame-to-frame illumination changes.

Consequently, a need exists for an improved tracking technique that can track objects within a scene which are undergoing local deformations and illumination changes.

SUMMARY OF INVENTION

The present invention provides an improvement designed to satisfy the aforementioned needs. Particularly, the present invention is directed to a method for tracking a predetermined, two-dimensional portion of an image throughout a sequence of images, the method comprising the steps of (a) selecting a reference frame; (b) selecting the predetermined, two-dimensional portion within the reference frame by choosing a polygon that defines the boundary of the predetermined portion; (c) fitting a reference mesh having at least three corner nodes and an inside node to the reference polygon; (d) tracking the reference polygon in subsequent or previous image frames by tracking the corner nodes; (e) mapping the reference mesh into the tracked polygon in the subsequent or previous image frames; and (f) refining locations of the inside and corner nodes in the subsequent or previous image frames for tracking local and global motion of the predetermined portion.

BRIEF DESCRIPTION OF DRAWINGS

In the course of the following detailed description, reference will be made to the attached drawings in which:

FIG. 1 is a perspective view of a computer system for implementing the present invention;

FIGS. 2A and 2B are flowcharts for the method of the present invention;

FIG. 3 is a diagram illustrating the method of FIGS. 2A and 2B;

FIG. 4 is an exploded view of a portion of FIG. 3;

FIG. 5 is a diagram illustrating the corner tracking method of FIGS. 2A and 2B;

FIGS. 6A, 6B, 7 and 8 are diagrams further illustrating the corner tracking method of FIGS. 2A and 2B;

FIG. 9 is a diagram illustrating the method for mapping a reference mesh of FIGS. 2A and 2B;

FIGS. 10A, 10B, 11, 12A and 12B are diagrams depicting a method for refining the location of inside nodes of the reference mesh;

FIG. 13 illustrates a logarithmic search method for refining the location of an inside node;

FIGS. 14A-14C are diagrams further illustrating the method of FIG. 13;

FIG. 15 is a diagram illustrating a logarithmic search method for refining the location of a boundary node;

FIG. 16 is a flowchart illustrating the method of FIG. 15;

FIG. 17 is a diagram illustrating a logarithmic search method for refining the location of a corner node;
FIG. 18 is a diagram further illustrating the method of 
FIG. 17; 

FIG. 19 is a diagram illustrating a method of incorporating 
illumination changes during motion tracking; 

FIGS. 20A-20E are diagrams illustrating a hierarchical 
hexagonal search method. 

DETAILED DESCRIPTION OF INVENTION 

Referring to FIG. 1, there is illustrated a computer system 1 for implementing the present invention. Although the computer system 1 is shown for the purpose of illustrating a preferred embodiment, the present invention is not limited to the computer system 1 shown, but may be used on any electronic processing system (for example a SPARC-20 workstation). The computer system 1 includes a microprocessor based unit 2 for receiving and processing software programs and for performing other well known processing functions. The software programs are contained on a computer usable medium 3, typically a disk, and are inputted into the microprocessor based unit 2 via the disk 3. A display 4 is electronically connected to the microprocessor based unit 2 for displaying user related information associated with the software. A keyboard 5 is also connected to the microprocessor based unit 2 for allowing a user to input information to the software. As an alternative to using the keyboard 5, a mouse 6 may be used for moving an icon 7 on the display 4 and for selecting an item on which the icon 7 overlays, as is well known in the art. A compact disk-read only memory (CD-ROM) 8 is connected to the microprocessor based unit 2 for receiving software programs and for providing a means of inputting the software programs and other information to the microprocessor based unit 2. A compact disk (not shown) typically contains the software program for inputting into the CD-ROM 8. A printer 9 is connected to the microprocessor based unit 2 for printing a hardcopy of the output of the computer system 1. 

The below-described steps of the present invention are implemented on the computer system 1, and are typically contained on a disk 3 or other well known computer usable medium. Referring to FIGS. 2 and 3, there are illustrated five steps of the present invention which are first succinctly outlined and later described in detail. Briefly stated, these five steps are as follows: (i) selection of a reference frame and a reference polygon 10; (ii) fitting a 2-dimensional mesh inside the reference polygon 20; (iii) tracking the corners of the polygon in the previous frame 30; (iv) mapping the previous mesh onto the polygon in the current frame 40; and (v) local motion estimation via a hexagonal search and corner refinement 50. 

A. Selection Of The Reference Frame And Polygon (Step 10) 

Referring to FIGS. 2 and 3(a), in the first step 10, the user selects an object (i.e., the reference object) 11 within any frame (i.e., the reference frame) 14 that is to be tracked for eventual replacement with another object, which is hereinafter referred to as the replacement object 17. A convex reference polygon 12 is placed over the reference object 11 so that the boundary of the reference polygon 12 coincides with the boundary of the reference object 11. The user creates the reference polygon 12 by selecting corners 13 of the reference polygon 12 for defining its boundary. It is advantageous to model the boundary of the reference object 11 as a polygon for two reasons. First, the polygon can be of any size, and secondly, by tracking only corners, it can be determined how the boundary of the reference object 11 has moved from one frame to another frame. 






It is instructive at this point to clarify some of the notation used herein, which is as follows. T denotes the total number of frames in the image sequence in which the reference object 11 is to be tracked. For convenience, the user renumbers the frames in which the reference object 11 is to be tracked, typically starting with 1. In this regard, f_n denotes the renumbered frames, P_n denotes the polygon in f_n, and M_n denotes the mesh in f_n, where 1 ≤ n ≤ T. Furthermore, r denotes the sequence number of the reference frame 14, and f_r, P_r, and M_r respectively denote the reference frame 14, the reference polygon 12, and the reference mesh 21. Finally, L denotes the number of corners of P_r. 

The time order of processing is arbitrary and does not affect the performance of the method. Preferably, the forward time direction is first chosen so that the reference object 11 is first tracked in frames with sequence numbers (r+1), (r+2), . . . , T and then in frames with sequence numbers (r-1), (r-2), . . . , 1. 

B. Fitting a 2-D Mesh Into The Reference Polygon (Step 20) 
Referring to FIGS. 2, 3 and 4, the next step 20 involves 
fitting a mesh 21, called the reference mesh 21, to the 
reference polygon 12; this mesh is subsequently tracked. It is the 
subdivisions of the mesh 21 that allow for tracking regions 
that exhibit locally varying motion, such as those corre- 
sponding to objects within a particular scene having either 
curved or deforming surfaces or both in combination. The 
subdivisions, or patches 22, of the reference mesh 21 are 
defined by the locations of the nodes 23 of the reference 
mesh 21. For example, FIG. 4 shows a depiction of a 
triangular mesh 21 fit into a quadrilateral reference polygon 
12. 

To create the reference mesh 21, the reference polygon 12 
is first placed on a regular rectangular grid. The dimensions 
of each rectangle in the regular rectangular grid are specified 
by the user. The non-rectangular cells (e.g., trapezoids, 
pentagons, etc.) that are formed along the boundary of the 
reference polygon 12 are divided into an appropriate number of 
triangles as shown in FIG. 4. If it is desired that the reference 
mesh 21 contain only triangular elements, each rectangular 
cell is further divided into two triangular elements. Thus, the 
reference mesh 21 consists of patches 22 that are of the same 
size except for the ones that are around the boundary of the 
reference polygon 12. It is instructive to note that the nodes 
23 are also corners of the patches 22. As may be obvious to 
those skilled in the art, the mesh 21 is completely described 
by the collection of its patches 22. 
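
The grid-based construction above can be sketched for the simplest situation, in which the reference polygon is an axis-aligned rectangle whose sides are whole multiples of the cell size, so that no non-rectangular boundary cells arise. The function name, the node indexing, and the choice of diagonal for splitting each cell are illustrative assumptions and are not taken from the disclosure.

```python
def grid_triangle_mesh(width, height, cell_w, cell_h):
    """Triangulate the rectangle [0, width] x [0, height] on a regular grid.

    Assumes width and height are whole multiples of cell_w and cell_h
    (hypothetical simplification: no boundary cells need special handling).
    Returns (nodes, patches); each patch is a triple of node indices.
    """
    nx, ny = width // cell_w, height // cell_h
    nodes = [(i * cell_w, j * cell_h)
             for j in range(ny + 1) for i in range(nx + 1)]

    def idx(i, j):
        return j * (nx + 1) + i

    patches = []
    for j in range(ny):
        for i in range(nx):
            a, b = idx(i, j), idx(i + 1, j)
            c, d = idx(i + 1, j + 1), idx(i, j + 1)
            # Each rectangular cell is divided into two triangular elements.
            patches.append((a, b, c))
            patches.append((a, c, d))
    return nodes, patches


nodes, patches = grid_triangle_mesh(8, 4, 4, 2)
print(len(nodes), len(patches))  # 9 8
```

Because every node is stored once and patches refer to nodes by index, moving a node later on moves the shared corner of all patches that use it.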

Referring to FIG. 2, once the reference mesh 21 is determined, the frame number n is set to r+1 in step 25a, and if n ≤ T (step 26a), frame f_n is read in step 27a. Once the reference mesh 21 is tracked in frame f_n using Steps 30a, 40a, and 50a, the frame number n is incremented by 1 in step 28a and the incremented value is compared with T in step 26a. Steps 26a, 27a, 30a, 40a, 50a, and 28a are repeated until n>T. When n>T, the frame number n is set to r-1 in step 25b and compared with 1 in step 26b. If n ≥ 1, frame f_n is read in step 27b. Then, the reference mesh 21 is tracked in frame f_n using Steps 30b, 40b, and 50b, and the frame number n is decreased by 1 in step 28b. Steps 26b, 27b, 30b, 40b, 50b, and 28b are repeated until n<1. 
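
The two loops of FIG. 2 amount to a forward pass followed by a backward pass. A minimal sketch of the resulting frame order is given here; the function names are hypothetical, and `previous_frame` encodes the convention, stated in the paragraph that follows, that the previous frame is the neighbouring frame lying toward the reference frame.

```python
def processing_order(r, T):
    """Frame numbers in the order they are tracked, for reference frame r."""
    forward = list(range(r + 1, T + 1))    # steps 25a-28a: n = r+1, ..., T
    backward = list(range(r - 1, 0, -1))   # steps 25b-28b: n = r-1, ..., 1
    return forward + backward


def previous_frame(n, r):
    """Frame whose tracked polygon and mesh are carried into frame n (n != r)."""
    return n - 1 if n > r else n + 1


order = processing_order(r=3, T=6)
print(order)  # [4, 5, 6, 2, 1]
```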

Referring to FIGS. 2 and 3(b), hereinafter, f_n is called the current frame 114, P_n is called the current polygon 112, and M_n is called the current mesh 121. The previous frame 214 refers to the frame f_{n-1} if n>r, or to the frame f_{n+1} if n<r. It is instructive to note that for both n=r+1 and n=r-1, the previous frame 214 is in fact the reference frame 14. Furthermore, the tracked reference polygon 12 in the previous frame 214 is called the previous polygon 212, and the tracked reference mesh 21 in the previous frame 214 is called the previous mesh 221.

C. Tracking Corners Of The Polygon (Step 30)

The corners 213 of the previous polygon 212 are independently tracked into the current frame f_n 114, as shown in FIG. 5, to find an initial estimate of the current polygon 112. This initial estimate is then refined using the corner refinement process as explained later in this section.

Referring to FIGS. 2, 3, 5, 6, 7, and 8, the corner tracking method includes the following steps: (1) selecting a motion model for the corners 13, (2) assigning a cost polygon 31 to each corner 13, (3) finding the best motion parameters for each cost polygon 231 in the previous frame 214 using a logarithmic search method, and (4) mapping the corners 213 of the previous frame 214 into the current frame f_n 114 with the best motion parameters found for their respective cost polygons 231. In the following, we give a detailed description of these steps.

C1. Selecting A Motion Model For A Corner (Step 30)

Depending on the local motion around the corners 13, one selects for each corner 13 one of the following models: (i) translation, (ii) translation, rotation, and zoom, (iii) affine, (iv) perspective, and (v) bilinear. The translational model is the simplest one, and should preferably be selected if this model is applicable, as one well skilled in the art can determine. If the translational model is selected, the user will be required to specify a rectangular search region to indicate the range of translational displacement for the corner 13.

If the local motion around a corner 13 involves either rotation or zoom or both in combination, which is easily determined by one well skilled in the art, we employ the second model, i.e., translation, rotation, and zoom. In this case, in addition to a rectangular search region for the translational part of the motion, the user will be required to specify a second rectangular search region to indicate the range of rotation and zoom.

On the other hand, if there is shear and/or directional scaling of pixels around a corner 13, we employ the third motion model, namely the affine motion model. In order to find the affine motion parameters, the user will need to specify three rectangular search regions: one for the translational part, one for the rotation and zoom part, and one for the shear and directional scaling.

Finally, if perspective or nonlinear deformations are observed in the neighborhood of the corner 13, we employ the fourth or the fifth motion model, respectively. For both the fourth and the fifth models, the user will specify four rectangular search regions: three of them will determine the extent of the affine motion, and the remaining one will determine the amount of perspective or bilinear deformation. As will be explained later, as the complexity of the motion model increases, i.e., as the number of search regions increases, so do the computational requirements for finding the best parameters for the model. Therefore, in order to reduce the computational requirements, it is preferred to use during corner tracking, step 30, as simple a motion model as allowed by the characteristics of the motion around the corner 13.

The size of each search region is defined in integer powers of 2. For each search region, the user specifies two integers, one to indicate the horizontal dimension and one to indicate the vertical dimension of the search region. Thus, if h and v denote the integers specified by the user to respectively indicate the horizontal and vertical dimensions of the search region, then the size of the search region is given by 2^{h+2} × 2^{v+2}.

C2. Defining The Cost Polygon For A Corner

In order to track a corner 213 of the previous polygon 212 into the current frame 114, the user is required to specify a region 31 of pixels around each corner 13 in the reference frame 14 that permits the motion model selected for that corner 13. This region is specified in the form of a polygon, hereinafter defined as the cost polygon 31. The cost polygon 31 can be defined as a rectangular block centered around the corner 13 as shown in FIG. 6(a), or it can be defined as a polygon having the corner 13 as one of its corners while completely remaining inside the reference polygon 12. In the latter case, one possible choice for the cost polygon 31 is a scaled-down version of the reference polygon 12 placed at the corner 13 as shown in FIG. 6(b). It is instructive to note that the size of the cost polygon 31 should be as large as possible provided that the pixels within the cost polygon 31 permit the same motion parameters.

In the following, K denotes the number of search regions specified by the user. As indicated earlier, the number of search regions is determined by the complexity of the motion model. Let C denote the cost polygon 31, and let L denote the total number of its corners 33, e.g., L=4 in FIG. 6(a), and L=5 in FIG. 6(b). Each search region is assigned to a distinct corner 33 of the cost polygon 31. Thus, it is required that K ≤ L. Referring to FIG. 7, the corners 33 that are assigned a search region are called moving corners (MC) 32. Obviously, if K=L, then all corners 33 will be moving corners 32. The moving corners 32 are numbered from 1 to K in the order of increasing motion complexity (i.e., MC_1 introduces translational motion; MC_2 introduces rotation and zoom if K ≥ 2; MC_3 introduces shear and directional scaling if K ≥ 3; and MC_4 introduces perspective or bilinear deformation if K=4). One possible definition for the moving corners 32 is given by

$$MC_i = C_{\lfloor iL/K \rfloor}, \quad i = 1, 2, \ldots, K, \qquad (1)$$

where ⌊x⌋ denotes the largest integer not greater than x, and C_i stands for the i-th corner of C, e.g., as shown in FIG. 7, for L=5 and K=3, we have MC_1=C_1, MC_2=C_3, and MC_3=C_5.

C3. Finding The Best Motion Parameters For A Cost Polygon

Referring to FIGS. 5, 6, 7 and 8, the method for tracking a corner 213 of the previous polygon 212 into the current frame 114 is as follows. The following is repeated for each corner 213 of the previous polygon 212.

Let r_i, i=1, . . . , K, denote the locations of the moving corners (MC_i, i=1, . . . , K) 32 of the cost polygon 31 in the reference frame 14, and let p_i, i=1, . . . , K, respectively denote the initial locations of the moving corners 132 of the cost polygon 131 in the current frame 114. The initial locations of the moving corners 132 in the current frame are obtained from the locations of the moving corners 232 of the cost polygon 231 in the previous frame 214.

Let D* denote the best mapping for the corner 213, i.e., D* is the mapping that gives the best location of the corner 213 in the current frame 114. In the following, a method is given to determine D*. Let I_r denote the intensity distribution in the reference frame 14, and let I_c denote the intensity distribution in the current frame 114. Let h_i, v_i, i=1, . . . , K, denote the integers that respectively determine the size of the search regions 34 for MC_i, i=1, . . . , K. The user also specifies the accuracy of the search as an integer power of 2. Let a denote the accuracy of the search.

We are now ready to describe, step by step, the logarithmic search method used for corner tracking. A demonstration of the following is given in FIG. 8.
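
For the simplest motion model (translation only, K = 1), the logarithmic search that the numbered steps which follow describe can be sketched as below. This is an illustration under stated assumptions rather than the patented procedure itself: images are plain nested lists of intensities, the cost polygon is passed as an explicit list of pixel coordinates, the matching error is a sum of absolute intensity differences, the accuracy exponent `a` is restricted to non-negative integers so that all displacements are whole pixels, and candidates that fall outside the current frame are simply rejected. All function and parameter names are hypothetical.

```python
def matching_error(ref, cur, pixels, dx, dy):
    """Sum of absolute differences between ref at `pixels` and cur at the
    same pixels displaced by (dx, dy); infinite if a pixel leaves the frame."""
    total = 0
    for x, y in pixels:
        u, v = x + dx, y + dy
        if not (0 <= v < len(cur) and 0 <= u < len(cur[0])):
            return float("inf")
        total += abs(ref[y][x] - cur[v][u])
    return total


def log_search_translation(ref, cur, pixels, h, v, a=0):
    """Coarse-to-fine search with steps 2**h, 2**(h-1), ..., 2**a (a >= 0)."""
    dx = dy = 0
    while h >= a or v >= a:
        step_x = 2 ** h if h >= a else 0
        step_y = 2 ** v if v >= a else 0
        # Nine candidate displacements around the current best estimate.
        candidates = [(dx + m * step_x, dy + n * step_y)
                      for m in (-1, 0, 1) for n in (-1, 0, 1)]
        dx, dy = min(candidates,
                     key=lambda d: matching_error(ref, cur, pixels, d[0], d[1]))
        h -= 1   # halve the step for the next level of the search
        v -= 1
    return dx, dy


# Synthetic example: a smooth intensity bowl displaced by (3, 2) pixels.
size = 20
ref = [[(x - 8) ** 2 + (y - 8) ** 2 for x in range(size)] for y in range(size)]
cur = [[(x - 11) ** 2 + (y - 10) ** 2 for x in range(size)] for y in range(size)]
block = [(x, y) for y in range(6, 11) for x in range(6, 11)]
print(log_search_translation(ref, cur, block, h=1, v=1))  # (3, 2)
```

With h = v = 1 the steps are 2 and then 1, so displacements of up to 3 pixels in either direction are reachable on each axis. The models with K ≥ 2 would additionally perturb the remaining moving corners and fit a higher-order mapping at each level, which this sketch does not attempt.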
1. MC_1 is moved to 9 different locations in the current frame 114 that are given by

$$p_{1;m_1,n_1} = p_1 + \begin{bmatrix} 2^{h_1} m_1 \\ 2^{v_1} n_1 \end{bmatrix}, \quad m_1, n_1 = -1, 0, 1. \qquad (2)$$

2. If K=1, we find the translational mappings

$$D_{1;m_1,n_1} : r_1 \rightarrow p_{1;m_1,n_1}, \quad m_1, n_1 = -1, 0, 1, \qquad (3)$$

and compute the matching error value for each mapping,

$$E_{m_1,n_1} = \sum_{x \in C} \left| I_r(x) - I_c(D_{1;m_1,n_1}\, x) \right|. \qquad (4)$$

The best translational mapping is the one for which the matching error value is the smallest, i.e.,

$$D^* = \arg\min_{m_1,n_1 = -1,0,1} E_{m_1,n_1}. \qquad (5)$$

We then move to Step 6. If K ≥ 2, however, we let

$$q_1 = p_1 \quad \text{and} \quad s_k = (m_1, n_1, \ldots, m_k, n_k), \quad k = 1, \ldots, K, \qquad (6)$$

for notational simplicity, and continue.

3. We let k=2 and find the following 9 translational mappings

$$M_{1;s_1} : q_1 \rightarrow q_{1;s_1}, \quad m_1, n_1 = -1, 0, 1, \qquad (7)$$

where q_{1;s_1} denotes the location p_{1;m_1,n_1} given in (2).

4. We move MC_k to the following 9^{k-1} locations in the current frame 114:

$$q_{k;s_{k-1}} = M_{k-1;s_{k-1}}\, p_k, \quad m_1, n_1, \ldots, m_{k-1}, n_{k-1} = -1, 0, 1. \qquad (8)$$

For each q_{k;s_{k-1}} we compute the following 9 different locations

$$q_{k;s_k} = q_{k;s_{k-1}} + \begin{bmatrix} 2^{h_k} m_k \\ 2^{v_k} n_k \end{bmatrix}, \quad m_k, n_k = -1, 0, 1. \qquad (9)$$

If k<K we find the mappings

$$M_{k;s_k} : (p_1, \ldots, p_k) \rightarrow (q_{1;s_1}, q_{2;s_2}, \ldots, q_{k;s_k}), \quad m_k, n_k = -1, 0, 1, \qquad (10)$$

increment k by 1, and repeat Step 4; otherwise, i.e., if k=K, we continue.

5. We find the following 9^K mappings

$$D_{K;s_K} : (r_1, \ldots, r_K) \rightarrow (q_{1;s_1}, \ldots, q_{K;s_K}), \quad m_1, n_1, \ldots, m_K, n_K = -1, 0, 1, \qquad (11)$$

and compute for each one of them the matching error values

$$E_{s_K} = \sum_{x \in C} \left| I_r(x) - I_c(D_{K;s_K}\, x) \right|, \quad m_1, n_1, \ldots, m_K, n_K = -1, 0, 1. \qquad (12)$$

The best mapping is the one for which the matching error value is the smallest, i.e.,

$$D^* = \arg\min_{m_1,n_1,\ldots,m_K,n_K = -1,0,1} E_{s_K}. \qquad (13)$$

6. We let (m_1*, n_1*, . . . , m_K*, n_K*) denote the index values that correspond to D*. We decrement the values of h_1, v_1, . . . , h_K, v_K by 1. If h_i, v_i < a for all i=1, . . . , K, we have found the best model parameters with the desired accuracy and thus we stop. Otherwise we let the locations of the moving corners 132 that correspond to these index values be the new initial locations p_i, i=1, . . . , K, and go to Step 1 to implement the next level of the logarithmic search.

Once the best model parameters D* are found, the corner 13 of the reference polygon 12 is mapped to the current frame 114 with D*, and the above procedure is repeated for the remaining corners 13 of the reference polygon 12.

A method for finding D* is given in V. Seferidis and M. Ghanbari, "General Approach to Block-Matching Motion Estimation," Optical Engineering, volume 32 (7), pp. 1464-1474, July 1993. The presently disclosed method is an improvement over "General Approach to Block-Matching Motion Estimation," because (1) the effect of each component of the motion model on the transformations of the cost polygon 31 is controlled by the search region associated with that component of the motion model, and (2) non-convex polygons are less likely to occur during the logarithmic search process due to the cumulative nature of the movements of the corners 32 of the cost polygon 31. Furthermore, the error expressions (Equations 4 and 12) used in the presently disclosed method can be modified according to C.-S. Fuh and P. Maragos, "Affine models for image matching and motion detection," in IEEE International Conference Acoustic Speech and Signal Processing, pp. 2409-2412, May 1991, Toronto, Canada, so that illumination changes in the scene are also incorporated during corner tracking, step 30.

D. Mapping The Previous Mesh (Step 40)

Referring to FIG. 9, once an initial estimate of the current polygon 112 is determined in step 30, the next step 40 is to map the previous mesh 221 into the current polygon 112 to obtain an initial estimate for the current mesh 121. An initial estimate of the current mesh 121 is obtained by mapping the nodes 223 of the previous mesh 221 into the current polygon 112 using a set of affine transformations as follows.

Let M_p and M_c respectively denote the previous mesh 221 and the current mesh 121, with c and p respectively denoting the numbers of the current 114 and previous 214 frames, such that

$$1 \le c, r \le T \quad \text{and} \quad p = \begin{cases} c - 1, & \text{if } c > r \\ c + 1, & \text{if } c < r \end{cases} \qquad (14)$$

where r denotes the reference frame number. Let P_p and P_c respectively denote the previous polygon 212 and the current polygon 112. In order to compute the aforementioned affine transformations, first divide P_p and P_c into (L-2) triangles 45. Then the triangular division can be formulated as follows:

$$P_i = \bigcup_{k=1}^{L-2} R_{ik}, \quad i = p, c, \qquad (15)$$

where R_{ik} is the k-th triangle on P_i. Divide P_p and P_c as in (15) and find the affine transformations A_k: R_{pk} → R_{ck}, k=1, . . . , L-2. All nodes g_n 223 of M_p for n=1, . . . , N, where N is the total number of nodes 223, are visited sequentially and mapped into the current polygon 112 as A_m(g_n) if g_n ∈ R_{pm}.
Mapping the corners 223 of a patch 222 in the previous 
polygon 212 to the corners 123 of a patch in the current 
polygon 112 is shown in FIG. 9. Based on the properties of 
the affine transformation as explained in G. Wolberg, "Digital 
Image Warping," IEEE Computer Society Press, 1992, Los 
Alamitos, Calif., which is herein incorporated by reference, 
if a node 23 is on the boundary of two triangles 45, it is 
mapped to the same location by the affine transformations 
obtained for both triangles. The current mesh M_c 121 
constructed from the previous mesh M_p 221 using the affine 
transformations as defined above is called the initial current 
mesh 121 in the current frame 114, and is refined by using 
the method given in the following section. 
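
The mapping of step 40 can be illustrated with a short sketch. It assumes that both polygons are convex and list their corners in the same order, and that the (L-2) triangles are formed as a fan from the first corner; the disclosure only requires some division into (L-2) triangles, so the fan is an illustrative choice, and all names are hypothetical.

```python
def affine_from_triangles(src, dst):
    """Affine coefficients ((a11, a12, a13), (a21, a22, a23)) that send the
    three corners of triangle src to the corresponding corners of dst."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)
    if det == 0:
        raise ValueError("degenerate source triangle")

    def coeffs(w1, w2, w3):
        a = (w1 * (y2 - y3) + w2 * (y3 - y1) + w3 * (y1 - y2)) / det
        b = (w1 * (x3 - x2) + w2 * (x1 - x3) + w3 * (x2 - x1)) / det
        c = (w1 * (x2 * y3 - x3 * y2) + w2 * (x3 * y1 - x1 * y3)
             + w3 * (x1 * y2 - x2 * y1)) / det
        return a, b, c

    return coeffs(*(p[0] for p in dst)), coeffs(*(p[1] for p in dst))


def contains(tri, p, eps=1e-9):
    """True if point p lies inside or on the boundary of triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    l2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return min(l1, l2, 1.0 - l1 - l2) >= -eps


def fan(polygon):
    """Divide a convex polygon with L corners into L-2 triangles."""
    return [(polygon[0], polygon[k], polygon[k + 1])
            for k in range(1, len(polygon) - 1)]


def map_nodes(nodes, prev_polygon, cur_polygon):
    """Map each node with the affine transform of the triangle containing it."""
    pairs = list(zip(fan(prev_polygon), fan(cur_polygon)))
    mapped = []
    for g in nodes:
        for src, dst in pairs:
            if contains(src, g):
                (a, b, c), (d, e, f) = affine_from_triangles(src, dst)
                mapped.append((a * g[0] + b * g[1] + c,
                               d * g[0] + e * g[1] + f))
                break
        else:
            raise ValueError("node lies outside the previous polygon")
    return mapped


prev_poly = [(0, 0), (4, 0), (4, 4), (0, 4)]
cur_poly = [(1, 2), (9, 2), (9, 10), (1, 10)]
mapped = map_nodes([(2, 1), (1, 3)], prev_poly, cur_poly)
print(mapped)  # approximately [(5.0, 4.0), (3.0, 8.0)]
```

A node on the edge shared by two triangles is accepted by the first triangle tested; as the text notes, both affine transformations send such a node to the same location, so the choice does not matter.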
E. Hexagonal Search and Corner Refinement (Step 50) 

Referring to FIGS. 10 through 18, an efficient search strategy is employed to refine the initial current mesh 121 on the current frame 114. This allows for handling image regions containing locally varying motion, i.e., image regions corresponding to scene objects with curved surfaces or surfaces that undergo mild deformations (i.e., deformations that do not cause self occlusion of parts of the object). The invention also discloses a method to account for possible changes in the illumination in the scene. The detailed description of this method is furnished later in this Section.

Let N and M respectively denote the number of nodes 23 and patches 22 in the reference mesh 21. Also, let g_i and r_j respectively denote the i-th node 123 and the j-th patch 122 in the current mesh 121, where i=1, . . . , N and j=1, . . . , M. Each patch 122 in the current mesh 121 is allowed to go through spatial warpings that are either affine or bilinear by moving the nodes 123 of the mesh 121. Affine and bilinear warpings are discussed in detail in G. Wolberg, "Digital Image Warping," IEEE Computer Society Press, 1992, Los Alamitos, Calif. Affine mapping assumes three point correspondences and has six parameters:

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad (16)$$

where (x,y) and (u,v) denote the coordinates of a point before and after the affine mapping is applied, respectively. An affine map maps a rectangular block into an arbitrary parallelogram, giving shear, scaling, rotation and translation to it. Bilinear mapping assumes four point correspondences and has eight parameters:

$$u = a_0 + a_1 x + a_2 y + a_3 xy, \qquad v = b_0 + b_1 x + b_2 y + b_3 xy. \qquad (17)$$

Note that affine mapping is obtained from bilinear mapping by setting a_3=0 and b_3=0 in (17). For even further details, the book "Digital Image Warping" can be referenced.

Our method uses the affine mapping when the reference mesh 21 includes only triangular patches 22. When the reference mesh 21 includes only rectangular patches 22, our method uses only the bilinear transformation. It is also possible that the reference mesh 21 contains both triangular and rectangular patches 22, in which case our method employs the affine transformation for the triangular patches 22, and the bilinear transformation for the rectangular patches 22.

Due to the ratio preserving properties of the bilinear and affine transformations, the image intensity distribution along the sides of the patches 122 in the current frame 114 always remains continuous as the nodes 123 are moved. In this step, the corners 113 of the current polygon 112 are also refined, as will be described in detail below.

Referring to FIG. 4, three different types of nodes 23 are identified on the reference mesh 21. They are as follows: nodes 51 that are inside the polygon 12 (inside nodes), nodes 52 that are on the boundary of the polygon 12 (boundary nodes), and nodes 53 that are at the corners of the polygon 12 (corner nodes). Once the initial current mesh 121 is obtained, the positions of inside 51, boundary 52, and corner 53 nodes on the current mesh 121 are refined so that the difference in the intensity distribution between the current polygon 112 and its prediction from the reference polygon 12 is minimized. In order to refine the positions of the inside nodes 51, the hexagonal search approach is used as disclosed in Y. Nakaya and H. Harashima, "Motion compensation based on spatial transformations," IEEE Transactions Circuits and System Video Technology, vol. 4, pp. 339-357, June 1994. It is an iterative displacement estimation method that evaluates candidate spatial transformations. Using hexagonal search, Y. Nakaya and H. Harashima refine the positions of only the inside nodes 51. The positions of boundary nodes 52 are refined using a block matching approach in J. Niewglowski, T. Campbell, and P. Haavisto, "A Novel Video Coding Scheme Based on Temporal Prediction Using Digital Image Warping," IEEE Transactions Consumer Electronics, vol. 39, pp. 141-150, August 1993. The present invention refines the positions of the boundary nodes 52 using a variation of the hexagonal search method. Since boundary nodes 52 that are on the same line must remain on the same line, their motion must be restricted to a line space during the hexagonal search. Thus, for a boundary node 52 the search space is one-dimensional while it is two-dimensional for an inside node 51.

E1. Refining the Locations of the Inside Nodes 51

The position of each inside node 51 in the initial current mesh 121 is refined in an arbitrary order. Referring to FIG. 10, let G be the current inside node 51 whose position is to be refined and let S_1, S_2, . . . , S_K denote the patches 122 surrounding G 51 on the current mesh M_c 121, where K is the number of patches 122 for G. Let the corresponding patches 22 on M_r 21 be denoted by S_{r1}, S_{r2}, . . . , S_{rK}. The first step of the hexagonal search for node G 51 is to find the region in the current frame 114 that will be affected by the movement of G 51. This region is called the cost polygon 54 and denoted by S, where

$$S = \bigcup_{i=1}^{K} S_i.$$


55 



The cost polygon 54 for node G 51 in the current frame 114 
can be generated very easily using the following steps: 

1. Set i=0 and create an empty point list, 

2. Let i←i+1, 

3. Construct patch S_i, let z=size of S_i, 

4. Find the corner index, j, of G on patch S_i, 

5. From k=j+1 to k=j+z-1, append the (k mod z)'th corner 
of S_i to the point list if it is not already in the list. 

6. If i<K go to step 2. 

7. The points in the list will be clockwise ordered. 
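
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name `cost_polygon`, the use of integer node ids, and the representation of each patch as a list of corner ids are assumptions made for the example. It also assumes that the corners of each patch, and the patches themselves around G, are supplied in clockwise order.

```python
def cost_polygon(patches, g):
    """Collect the outline of the patches that surround node g.

    patches: list of patches around g (assumed clockwise around g); each
             patch is a list of corner ids (assumed clockwise).
    g:       id of the node being refined.
    Names and data layout are hypothetical, chosen for illustration.
    """
    points = []                        # step 1: empty point list
    for patch in patches:              # steps 2, 3 and 6: visit S_1 .. S_K
        z = len(patch)                 # step 3: size of the patch
        j = patch.index(g)             # step 4: corner index of g
        for k in range(j + 1, j + z):  # step 5: k = j+1 .. j+z-1
            corner = patch[k % z]
            if corner not in points:
                points.append(corner)
    return points                      # step 7: clockwise-ordered outline


# Six triangles around node 0 give a hexagonal cost polygon.
triangles = [[0, 1, 2], [3, 0, 2], [0, 3, 4], [0, 4, 5], [0, 5, 6], [6, 1, 0]]
hexagon = cost_polygon(triangles, 0)
```

With triangular patches around an inside node, the returned list has six entries, which matches the hexagonal shape discussed in the text.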

If the reference polygon 12 is rectangular, and triangular 
patches 22 are used, then all cost polygons 54 turn out to be 
hexagons, as shown in FIG. 10, for inside nodes 51, hence 
the name "hexagonal search". During the search, node G 51 
in FIG. 10(a) is moved to a new location as shown in FIG. 
10(b) in a search space, updating the deformations of the 
triangular patches 122 inside the cost polygon 54. The 
updated patches 122 in FIG. 10(b) are called S'_i's (S'_i ← S_i). 

The predicted image inside the cost polygon 54 is 
synthesized by warping the undeformed patches 22, S_rk's, on the 
reference frame 14, onto the deformed patches 122, S'_k's, on 
the current frame. The mean absolute difference (MAD) or 
mean square error (MSE), formulated in Equation 18, is 
calculated inside the cost polygon 54, 

$$E(x) = \frac{1}{\sum_{k=1}^{K} N_k} \sum_{k=1}^{K} \sum_{(i,j) \in S'_k} \left| I_c(i,j) - I_r\big(T_k(i,j)\big) \right|^m \qquad (18)$$

In Equation 18, (i,j) denotes a pixel location on patch S'_k, 
and MAD or MSE values are calculated by setting m to 1 or 
2, respectively. In the same equation, T_k denotes the 
backward spatial transformation, T_k: S_rk ← S'_k, and N_k denotes the 
total number of pixels inside S'_k. The position of G 51 that 
minimizes the MAD or MSE value is registered as the new 
location for G 51. 
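
As an illustration of Equation 18, the following Python sketch computes the prediction error over the patches of a cost polygon. It assumes that the intensities of each deformed patch and of its warped prediction have already been gathered into arrays of equal length; the function name `prediction_error` and this data layout are assumptions made for the example, not part of the disclosure.

```python
import numpy as np


def prediction_error(current_patches, predicted_patches, m=1):
    """Mean |I_c - I_r(T_k(.))|^m over all pixels of all patches.

    current_patches:   list of arrays, pixel intensities of each patch S'_k
                       in the current frame.
    predicted_patches: list of arrays, intensities predicted for the same
                       pixels from the reference frame.
    m = 1 gives the MAD, m = 2 gives the MSE.
    """
    total = 0.0
    count = 0
    for cur, pred in zip(current_patches, predicted_patches):
        cur = np.asarray(cur, dtype=float)
        pred = np.asarray(pred, dtype=float)
        total += float(np.sum(np.abs(cur - pred) ** m))
        count += cur.size              # accumulates the sum of N_k
    return total / count


mad = prediction_error([[1, 2], [3]], [[1, 4], [0]], m=1)
mse = prediction_error([[1, 2], [3]], [[1, 4], [0]], m=2)
```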

Usually, pixels fall onto non-integer locations after they 
are mapped using spatial transformations. Intensities for 
pixels that are on non-integer locations are obtained using 
bilinear interpolation. In bilinear interpolation, intensity 
values of the four neighboring pixel locations are employed 
as explained in the following. Assume that a pixel located at 
(i,j) on f_c is mapped to (x,y) on f_r. If a and b are the largest 
integers that respectively are not greater than x and y, and if 
ξ=x-a and η=y-b, then bilinear interpolation is formulated 
as follows: 

$$I_c(i,j) = (1-\xi)(1-\eta)\, I_r(a,b) + (1-\xi)\,\eta\, I_r(a,b+1) + \xi\,\eta\, I_r(a+1,b+1) + \xi\,(1-\eta)\, I_r(a+1,b)$$
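
The interpolation just described can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a nested list indexed as `image[a][b]` with the first index following x, and neighbours that would fall outside the image are clamped to the border. The function name `bilinear_sample` and the clamping rule are choices made for the example only.

```python
import math


def bilinear_sample(image, x, y):
    """Intensity at the non-integer location (x, y) of a reference image."""
    a = math.floor(x)                  # largest integer not greater than x
    b = math.floor(y)                  # largest integer not greater than y
    xi = x - a
    eta = y - b
    # Clamp the far neighbours so that border samples stay inside the image
    # (an assumption of this sketch).
    a1 = min(a + 1, len(image) - 1)
    b1 = min(b + 1, len(image[0]) - 1)
    return ((1 - xi) * (1 - eta) * image[a][b]
            + (1 - xi) * eta * image[a][b1]
            + xi * eta * image[a1][b1]
            + xi * (1 - eta) * image[a1][b])


reference = [[0.0, 10.0], [20.0, 30.0]]
center_value = bilinear_sample(reference, 0.5, 0.5)
```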



The present invention discloses a method to speed up the 
original hexagonal search, and referring to FIG. 12, involves 
finding a search polygon 55 and a search space 58 as 
explained below. The search polygon for node G 51 in the 
current frame 114 is found using the following steps: 

1. Set i=0 and create an empty point list, 

2. Let i←i+1, 

3. Construct patch S_i, let z=size of S_i, 

4. Find the corner index, j, of G on patch S_i, 

5. If z=4, patch is rectangular, append ((j+1) mod z)th and 
((j+3) mod z)th corners of S_i to point list if they are not 
already in the list. 

6. Else if z=3, patch is triangular, append ((j+1) mod z)th 
and ((j+2) mod z)th corners of S_i to point list if they are 
not already in the list. 

7. If i<K go to step 2. 

8. Order the points in the list in clockwise order. 

9. Let F denote the polygon formed by the points found 
in Step 8. If F is convex, then F is the search polygon 
55. If F is not convex, then find the largest convex 
polygon in F such that G 51 can move within F without 
causing any overlaps as shown in FIG. 11. We note that 
this operation is different from finding the convex hull 
of a polygon (which is well known in the art). 

The search space 58 is obtained by introducing a square 
window 57 around G, whose size is specified by the user. 
Intersection of the square window 57 with the search 
polygon 55 gives the search space 58. Examples of search 
polygon, search window, and search space are shown in FIG. 
12 when the patches 122 around G 51 are triangular (a) and 
when the patches 122 around G 51 are quadrilateral (b). In 
the following, let A denote the search space 58. 

The optimum location for G 51 is found using a 
logarithmic method which reduces the computational load, 
especially when subpixel accuracy is applied. The block diagram 
for logarithmic search is shown in FIG. 13. 

Let δ denote step size at each step of logarithmic search. 
In the first step 60 of logarithmic search, δ is set to an initial 
step size, specified by the user as a power of two, and the 
level number k is set to 0, step 61. Let g_k denote the image 
coordinates of node G 51 at level k of the logarithmic search, 
where g_1 is set to the location of G 51, step 62, the number 
of levels is set by the ratio 

$$\frac{\text{initial step size}}{\text{accuracy}}$$

and the value of accuracy is specified by the user. Increment 
k by 1, step 64, and if k=1, step 65, obtain the following 
locations by sampling the search space, A, with δ, step 66a: 

$$x_{ij} = g_k + \begin{bmatrix} i\delta \\ j\delta \end{bmatrix}, \text{ where } i, j \text{ are integers and } x_{ij} \in A \qquad (20)$$

G 51 is moved to each sample location 70 given above and 
shown in FIG. 14, the image intensities inside the cost 
polygon are synthesized, and the prediction errors, E(x_ij)'s, 
are computed. The sample location 70 giving the lowest 
prediction error is kept as the new location for G, i.e., let 
g_{k+1}=x*, step 68, such that E(x_ij) is minimum for x*, step 67. 
Set 

$$\delta \leftarrow \frac{\delta}{2}$$

step 69, and k←k+1, step 64. In the subsequent levels of the 
logarithmic search, i.e., for k>1, the search space 58 is 
limited to a 3x3 neighborhood 71 of g_k at each level. This 
3x3 region is given by, step 66b, 

$$x_{ij} = g_k + \begin{bmatrix} i\delta \\ j\delta \end{bmatrix}, \text{ where } i, j = -1, 0, 1, \text{ provided } x_{ij} \in A \qquad (21)$$

We let g_{k+1}=x*, step 68, such that E(x_ij) is minimum for x*, 
step 67, and set 

$$\delta \leftarrow \frac{\delta}{2}$$

step 69, and k←k+1, step 64. Logarithmic search is stopped 
once the desired accuracy is reached, i.e., when δ=accuracy, 
step 63. A simple realization of logarithmic search is shown 
in FIG. 14. In the exhaustive search strategy introduced in 
Nakaya, et al., the initial sampling rate, δ, is set to accuracy and 
only the first stage of our logarithmic search method, i.e., 
k=1, is applied. Assuming an NxN search window 57 and a 
desired accuracy value of a, there are up to 

candidates 70 for G 51 in an exhaustive search, compared to 

where ⌊x⌋ denotes the largest integer not greater than x, and 
s denotes the initial sampling rate, in the presently disclosed 
logarithmic search method. Thus, for example, for N=9, s=2, 
a=1/8, the presently disclosed logarithmic search approach is 
nearly 83 times faster than Nakaya, et al. 
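
The two-dimensional logarithmic search described above can be sketched in Python as follows. This is a simplified illustration rather than the disclosed flow diagram: the error function `error` and the membership test `in_space` (standing in for the search space A) are supplied by the caller, the first level samples a square window of half-size `half_window`, and all of these names are assumptions made for the example.

```python
def logarithmic_search(g, error, in_space, initial_step, accuracy, half_window):
    """Coarse-to-fine search for the node location with the lowest error.

    g:            starting location (x, y) of the node.
    error:        callable mapping a location to its prediction error.
    in_space:     callable returning True if a location lies in the search
                  space (hypothetical stand-in for the region A).
    initial_step: first step size, assumed to be a power of two.
    accuracy:     step size at which the search stops.
    """
    step = float(initial_step)
    best = (float(g[0]), float(g[1]))
    n = int(half_window // step)
    offsets = range(-n, n + 1)         # level 1: sample the whole window
    while True:
        candidates = [(best[0] + i * step, best[1] + j * step)
                      for i in offsets for j in offsets]
        candidates = [c for c in candidates if in_space(c)]
        if candidates:
            best = min(candidates, key=error)
        if step <= accuracy:           # desired accuracy reached
            break
        step /= 2.0                    # halve the step size
        offsets = (-1, 0, 1)           # later levels: 3x3 neighbourhood
    return best


target = (1.25, -0.75)
found = logarithmic_search(
    (0.0, 0.0),
    error=lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2,
    in_space=lambda p: abs(p[0]) <= 4 and abs(p[1]) <= 4,
    initial_step=2, accuracy=0.25, half_window=4)
```

In this toy run the error is a plain squared distance, so the search converges on the target after four levels (step sizes 2, 1, 0.5 and 0.25). With image data, `error` would synthesize the cost polygon and evaluate Equation 18.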

For a boundary node 52, the search space 58 is limited on 
a line as shown in FIG. 15. The cost polygon 54 for a 
boundary node 52 is formed in the same way as for an inside 
node 51, i.e., the cost polygon 54 for a boundary node 52 is 
the outline of the union of the patches 122 that have the 
boundary node 52 as one of their corners. The search 
polygon for a boundary node 52 is defined to be the line 
segment whose end points are the nodes 77, 78 that are 
neighbors to the boundary node 52 (G) on each side. The 
movement of G 52 is also limited by a rectangular search 
window 57 centered around G 52, whose size is specified by 
the user. The intersection of the search polygon 55 and the 
search window 57 results in the search space 58, which is 
denoted as B. A similar logarithmic method is then applied 
to boundary node G 52 in terms of a distance measure as 
explained below. 

Let δ denote step size at each step of logarithmic search. 
In the first step 80 of logarithmic search δ is set to an initial 
step size, specified by the user as a power of two, and the 
level number k is set to 0, step 81. Let g_k denote the image 
coordinates of grid G at level k of the logarithmic search, 
where g_1=G, step 82, the number of levels is set by the ratio 

$$\frac{\text{initial step size}}{\text{accuracy}}$$

and the value of accuracy is specified by the user. Increment 
k by 1, step 84, and if k=1, step 85, obtain the following 
locations 74 by sampling the search space B 58 with δ, step 
86a: 

$$x_i = g_k + i\,\delta\,u, \quad i \text{ is an integer such that } x_i \in B \qquad (22)$$

Let g_{k+1}=x*, step 88, such that E(x_i) is minimum for x*, step 
87. Set 

$$\delta \leftarrow \frac{\delta}{2}$$

step 89, and k←k+1, step 84. In the subsequent levels of the 
logarithmic search, i.e., for k>1, the search space is limited 
to 3 locations 75 in the neighborhood of g_k. These locations 
75 are calculated as, step 86b, 

$$x_i = g_k + i\,\delta\,u, \quad i = -1, 0, 1, \text{ provided } x_i \in B \qquad (23)$$

The logarithmic search is stopped once the desired 
accuracy is reached. The flow diagram of the logarithmic search 
method is shown in FIG. 16. 

The hexagonal search process is iteratively applied to the 
nodes 123 of the mesh 121 as explained in Nakaya, et al. 
Due to the nature of the hexagonal search mechanism, the 
number of nodes 123 whose positions are refined during one 
iteration will decrease with the increasing number of 
iterations. During an iteration, the nodes of the mesh 121 can be 
visited in a fixed or random order. Iterations are stopped 
when there is no node whose position needs to be refined or 
when a maximum number of iterations has been reached. 
Due to repeated warpings of patches 122, this iterative 
method is computationally expensive; however, it is possible 
to process up to one third of the nodes in parallel as 
suggested in Nakaya, et al. 

The present invention also refines the location of each 
corner node 53. The refinement is performed at each 
iteration of the hexagonal search. The corner nodes are refined 
after or before all boundary 52 and inside nodes 51 are 
refined at each iteration. This step is introduced to refine the 
corner locations obtained as a result of corner tracking. 

Let c_i denote the i-th corner of the current polygon 112 
which is also a corner node 53 of the current mesh 121. The 
problem in moving a corner node 53 during the refinement 
process is in defining a cost polygon 54 for a corner node 53. 
The cost polygon 54 has to be inside the current polygon, 
close to the corner node 53, and interact with the mesh 
structure. This invention introduces two methods which are 
called the "local method" and the "global method". Both 
methods are based on constructing a point and a patch list as 
explained below. In the local method, the point list is initialized 
with the corner node 53 (c_i) and the nodes in the current 
mesh 121 that are connected to c_i. Then, the nodes in the 
current mesh 121 that are connected to at least two of the 
nodes in the initial list are also added to the point list (this 
is because a quadrilateral can be diagonally divided into two 
triangles in two different ways, and the nodes that are 
connected to a corner node can be different for each case). 
In the global method, the point list is constructed in a different 
way. Referring to FIG. 17, in this case, a triangle 90, denoted 
as H, is formed by joining the previous, current and next 
corners of the current polygon in clockwise order, i.e., 
H=c_{i-1}c_i c_{i+1}; we call this triangle the "reference corner 
triangle". All the nodes of the mesh 121 that lie on or inside 
this triangle 90 form the point list. Once the point list is 
constructed using either one of the methods discussed 
above, a list of patches in the mesh 121 that will be affected 
by the movements of all nodes in the point list is 
constructed. The patch list is formed by the patches in the mesh 
121 that have as a corner at least one of the nodes in the point 
list. 

A logarithmic search strategy similar to the ones discussed 
for the hexagonal search method is applied for finding 
the best location for the corner node 53. The search space 58 
of the corner node 53 is defined by a square window around 
c_i, and denoted as D. The definition of the search space 58 
remains the same for each iteration of hexagonal search. 

Referring to FIG. 18, at the first step of logarithmic search 
for the corner refinement, the step size, δ, is set to half of the 
range, step 91, and the level number k is set to 0. Let S_k 
denote the k-th patch in the patch list, where k=1,2, . . . , K 
and K denotes the total number of patches in the patch list. 
Also let c_{i,k} denote the coordinate of the corner c_i at the k-th step 
of logarithmic search, where c_{i,1}=c_i, step 93, and 

$$k = 1, 2, \ldots, k_{\max}$$

with k_max depending on the accuracy. 
Increment k by 1, step 95, and obtain the following 9 
locations by sampling the search space with δ, step 96: 

$$x_{ij} = c_{i,k} + \begin{bmatrix} i\delta \\ j\delta \end{bmatrix}, \quad i, j = -1, 0, 1, \text{ provided } x_{ij} \in D \qquad (24)$$

When the corner c_i moves to a new location x, another 
triangle 100, which we call the "moved half triangle" and 
denote by H', where H'=c_{i-1} x c_{i+1}, is formed. Patches in the 
patch list are mapped using the affine transformation, A_x, 
between H and H'. After affine mapping is applied to the patches, 
deformed patches denoted as S'_k are obtained for k=1,2, . . . , 
K. Using the mappings between the undeformed patches on the 
reference mesh, S_rk, and the deformed patches S'_k, the intensity 
distribution on all patches is predicted. Using (18), E(x) is 
calculated, which is used as an error criterion for corner 
movement. The location x* where E(x_ij) is minimum, step 97, for 
i,j=-1,0,1, is kept as c_{i,k+1}, step 97. The step size is halved, step 
99, and the logarithmic search is continued until the step size 
is less than the accuracy, step 94. After k_max has been 
reached, logarithmic search is stopped. The reference half 
triangle and the effect of corner movement can be seen in FIG. 17. 

The current corner's movement can also be performed 
exhaustively. Then all locations calculated as 

$$x_{ij} = c_i + \begin{bmatrix} i \\ j \end{bmatrix}, \quad i, j = -\text{range}, \ldots, 0, \ldots, \text{range}, \text{ provided } x_{ij} \in D \qquad (25)$$

are tested for the smallest matching error (18). The location that 
minimizes the prediction error E(x) is chosen as the new 
location for c_i among the x_ij's. 

In the logarithmic method, the initial step for corner movement is 
set to half of the range, and at each iteration this step size is 
reduced to its half. The flow diagram for logarithmic search 
for a corner is shown in FIG. 18. 

A corner node 53 is visited, i.e., tested for possible 
refinement in its position, if the current iteration is the first 
one or a neighboring node in the mesh 121 has moved before 
the corner node 53 is visited. Corners are visited by two 
different strategies, called the "bottom-up" and "top-down" 
approaches. In the bottom-up approach, corner nodes 53 are 
visited after all the boundary 52 and inside 51 nodes are 
visited in every iteration. In the top-down approach, corner 
nodes 53 are visited before the boundary 52 and inside 51 
nodes are visited in every iteration. 

When a corner node 53 is moved to a new location during 
logarithmic search, the boundary of the current polygon 112 
is changed and all the nodes of the mesh 121 that are inside 
the reference half triangle 90 are mapped with the affine 
mapping between the reference half triangle 90 and the new 
half triangle 100 formed when the corner node 53 has moved. 

Incorporating Illumination Changes 

In the present invention, the displacement vector d_x for a pixel 
location x in a patch 22 is calculated bilinearly from the 
displacement vectors of the corners of the patch 22. 
Referring to FIG. 19, given a triangular patch ABC 22 and a pixel 
x inside the patch 22, the pixel location can be written as 

$$x = a + p\,\overrightarrow{AB} + q\,\overrightarrow{AC} \qquad (26)$$

where a gives the position of point A. 
If d_a, d_b, d_c denote the displacements of corners A, B, C, 
respectively, the displacement of pixel x, d_x, is calculated using 

$$d_x = (1-p-q)\,d_a + p\,d_b + q\,d_c \qquad (27)$$

If the patch 22 is rectangular, given the displacements of 
each corner, the displacement of a pixel x inside the rectangle 
ABCD is calculated using 

$$d_x = (1-p)(1-q)\,d_a + p(1-q)\,d_b + pq\,d_c + (1-p)q\,d_d \qquad (28)$$

where 

$$x = a + p\,\overrightarrow{AB} + q\,\overrightarrow{AD} \qquad (29)$$

A pictorial representation of interpolation for triangular and 
rectangular patches 22 is shown in FIG. 19. 

In C.-S. Fuh and P. Maragos, "Affine models for image 
matching and motion detection," in IEEE International 
Conference on Acoustics, Speech, and Signal Processing, pp. 
2409-2412, May 1991, Toronto, Canada, a method is 
disclosed to model the effects of illumination change on the 
image intensity distribution. The method disclosed in C.-S. 
Fuh and P. Maragos employs the following error criterion: 

$$MSE(d_x, r, c) = \frac{1}{N} \sum_{x} \big( I_c(x) - r\, I_r(x + d_x) - c \big)^2 \qquad (30)$$

where r is referred to as the multiplicative illumination 
coefficient and c is referred to as the additive illumination 
coefficient. 

In order to minimize Equation 30, C.-S. Fuh and P. 
Maragos first find optimal r and c by setting the partial 
derivatives ∂MSE/∂r and ∂MSE/∂c to 0. This yields two 
linear equations in r and c which can be solved to find 
optimal r* and c* as a function of d_x. Letting I_c(x)=I and 
I_r(x+d_x)=Ĩ, optimal solutions are given as follows: 

$$r^* = \frac{N \sum I \tilde{I} - \sum I \sum \tilde{I}}{N \sum \tilde{I}^2 - \big(\sum \tilde{I}\big)^2}, \qquad c^* = \frac{\sum \tilde{I}^2 \sum I - \sum \tilde{I} \sum I \tilde{I}}{N \sum \tilde{I}^2 - \big(\sum \tilde{I}\big)^2} \qquad (31)$$

where all the summations are over x in patch 22. 

In C.-S. Fuh and P. Maragos, the values of the 
illumination coefficients r and c are assumed constant for all x in 
patch 22. However, in most image sequences, illumination 
changes spatially within a frame. The present invention 
discloses a new method to overcome this problem. Rather 
than assigning a pair (r,c) of illumination coefficients to each 
patch 22 on the mesh 21, the presently disclosed method 
assigns a pair (r,c) of illumination coefficients to each node 
23 on the mesh 21. The presently disclosed invention allows 
illumination to continuously vary within a patch 22 by 
obtaining the illumination at location x from the illumination 
coefficients assigned to the corners of the patch 22 using 
bilinear interpolation, which is given below for a triangular 
patch 22: 

$$r_x = (1-p-q)\,r_a + p\,r_b + q\,r_c, \qquad c_x = (1-p-q)\,c_a + p\,c_b + q\,c_c$$

where (r_a,c_a), (r_b,c_b) and (r_c,c_c) are the illumination 
coefficients respectively assigned to corners A, B, and C, and 
(r_x,c_x) corresponds to the illumination at point x in patch 22. 

In order to estimate the illumination coefficients (r, c) for 
each node, the present invention discloses two different 
methods. In the first method, r and c are assumed to be 
constant inside the cost polygon 54 of an inside or boundary 
node, and values for r and c found as a result of hexagonal 
search are assigned to the node. For a corner node 53, the 
presently disclosed invention assigns a weighted average of 
r and c values calculated on the patches which have the 
corner node as one of their corners. The weights are 
determined by the area of each patch. 
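
The closed-form solution in Equation 31 is an ordinary least-squares fit of I against Ĩ, and can be sketched as below. This is only an illustration: the function name `illumination_coefficients` is hypothetical, and the fallback used when the predicted intensities are all equal (r set to 1 and c to the mean difference) is an assumption of this sketch, not something stated in the text.

```python
import numpy as np


def illumination_coefficients(current, predicted):
    """Return (r, c) minimizing the mean of (I - r * I_tilde - c)^2.

    current:   intensities I_c(x) over the pixels of a patch.
    predicted: motion-compensated intensities I_r(x + d_x) for the same pixels.
    """
    i_cur = np.asarray(current, dtype=float).ravel()
    i_pred = np.asarray(predicted, dtype=float).ravel()
    n = i_cur.size
    denom = n * np.sum(i_pred ** 2) - np.sum(i_pred) ** 2
    if denom == 0.0:
        # Flat prediction: the slope is undefined, so keep r = 1
        # (assumption made for this sketch only).
        return 1.0, float(np.mean(i_cur - i_pred))
    r = (n * np.sum(i_cur * i_pred) - np.sum(i_cur) * np.sum(i_pred)) / denom
    c = (np.sum(i_pred ** 2) * np.sum(i_cur)
         - np.sum(i_pred) * np.sum(i_cur * i_pred)) / denom
    return float(r), float(c)


# A patch that became twice as bright with an offset of 5.
r_opt, c_opt = illumination_coefficients([7, 9, 11, 13], [1, 2, 3, 4])
```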

In the second method, which is called the "interpolation 
method", r and c values are allowed to continuously vary for 
each pixel location inside the cost polygon 54 during 
hexagonal search. This method is used only for inside and 
boundary nodes. The bilinear interpolation method that has 
been mentioned above is applied for calculating these 
values. Let G denote a node and let K denote the number of 
patches that are in the cost polygon associated with G. Let 
S_k, k=1,2, . . . , K denote the patches that are in the cost 
polygon associated with node G. Let G_k, k=1, . . . , K, in 
clockwise (or counter-clockwise) order represent the nodes 
of the mesh that are on the cost polygon associated with 
node G. Then, the error criterion used in the second method 
is given by: 

$$MSE(d_x, r_G, c_G) = \frac{1}{N} \sum_{x} \big( I_c(x) - r_x\, I_r(x + d_x) - c_x \big)^2 \qquad (33)$$

where r_x and c_x are obtained by interpolating (r_G, c_G) and 
(r_Gk, c_Gk) over the patch that contains x, and 
where (r_Gk, c_Gk) are the illumination coefficients for node G_k. 
In the above expression (r_Gk, c_Gk) are fixed and assumed to 
be known. During the first iteration of the search, it is 
possible that some G_k's are not visited prior to G. To 
overcome this problem, the present invention assigns initial 
values to illumination coefficients on every node in the 
mesh. The initial values for the multiplicative and additive 
illumination coefficients are either respectively set equal to 
1.0 and 0.0, or to their previous values calculated on the 
previous image. 

Hierarchical Hexagonal Search 

The present invention also discloses a method, henceforth 
called "hierarchical hexagonal search method," to 
implement the method of (E) in a hierarchy of spatial resolutions. 
Referring to the example given in FIG. 20, once the 
reference mesh 21 is tracked into the current frame 114 and the 
current mesh 121 is obtained, new inside and boundary 
nodes, henceforth called high-resolution nodes 141, are 
added to the mesh 121 halfway on each link 140 that 
connects two nodes 123 in the mesh 121. Once the 
high-resolution nodes are added to the mesh 121, step 50 of the 
present invention is repeated to further refine the locations of 
the low-resolution 123 and the locations of the 
high-resolution 141 nodes. Once step 50 is completed with 
high-resolution nodes in the mesh 121, still higher resolution 
nodes can be added to the mesh 121 and step 50 then 
repeated any number of times. At this point, it is important 
to note that only the original low-resolution nodes 123 are 
mapped to a subsequent frame in step 40 to find the initial 
low-resolution mesh in the subsequent frame. 

The advantages of the hierarchical hexagonal search 
method are: (1) it is less sensitive to the initial patch size 
selected by the user, as the size of the patches is varied 
during the hierarchical search process; (2) it is 
computationally faster, as large local motion can be tracked in less time 
with larger patches than with smaller patches, and smaller 
patches can track local motion in less time when their initial 
location is determined by the movement of large patches. 

Synthetic Transfiguration 

One important application of the invention is in the area 
of synthetic object transfiguration where an object, such as 
the contents of a billboard, is replaced by a new object and 
rendered throughout the sequence in the same manner as the 
original object. The application of the method to synthetic 
transfiguration is described below. 

Referring to FIG. 3(a), once the 2-D meshes M_1, . . . , M_T 
are found which represent the global and local motion of the 
reference object 11 to be replaced, first, the reference mesh 
M_r 21 is mapped onto P_R 19 using a spatial transformation 
between P_R and the reference polygon P_r, where the 
polygon P_R 19 defines the boundary of the replacement object 17. 
For the transfiguration application, it is required that the 
spatial transformation between P_R and P_r can be represented 
by an affine transformation. Using the transformation 
between P_R 19 and P_r 12, the reference mesh M_r 21 is 
mapped onto f_R 18 to obtain the mesh M_R 16 on the 
replacement object 17. Then, the following backward 
mappings are computed 

$$H_{n,m}, \quad m = 1, 2, \ldots, N, \quad n = 1, 2, \ldots, T \qquad (34)$$

where H_{n,m} is the backward mapping between the m-th patch 
on the n-th mesh, M_n, and the m-th patch on the replacement 
mesh, M_R, N denotes the total number of patches in each 
mesh, and T denotes the number of frames in the given 
image sequence. 

If illumination changes are observed during the process of 
mesh tracking, they are also incorporated on the transfigured 
object 17 using 

$$I_n(x) = r_x\, I_R(H_{n,m}\, x) + c_x \quad \text{for all } x \text{ in the } m\text{-th patch of } M_n \qquad (35)$$

where I_n and I_R respectively denote the image intensity 
distribution at the n-th and replacement frame. The 
multiplicative illumination coefficient r_x and the additive 
illumination coefficient c_x are obtained by bilinearly interpolating 
the multiplicative and additive illumination coefficients 
found for the corners of the m-th patch of M_n during the process 
of mesh tracking. The details of the bilinear interpolation 
used for computing r_x and c_x are disclosed above. 
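
For a triangular patch, the interpolation of the node coefficients and the illumination correction of Equation 35 can be sketched as follows. The sketch assumes the replacement intensity at the mapped location has already been sampled; the function names `barycentric_pq`, `interpolate` and `relight` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def barycentric_pq(x, a, b, c):
    """Solve x = a + p * (b - a) + q * (c - a) for (p, q)."""
    a, b, c, x = (np.asarray(v, dtype=float) for v in (a, b, c, x))
    basis = np.column_stack((b - a, c - a))   # columns are AB and AC
    p, q = np.linalg.solve(basis, x - a)
    return float(p), float(q)


def interpolate(values, p, q):
    """Interpolate per-corner values (at A, B, C) to the point given by p, q."""
    va, vb, vc = values
    return (1 - p - q) * va + p * vb + q * vc


def relight(intensity, x, triangle, r_nodes, c_nodes):
    """Apply r_x * intensity + c_x at pixel x inside the triangle (A, B, C)."""
    p, q = barycentric_pq(x, *triangle)
    return interpolate(r_nodes, p, q) * intensity + interpolate(c_nodes, p, q)


tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
value = relight(10.0, (1.0, 2.0), tri, r_nodes=(1.0, 2.0, 3.0),
                c_nodes=(0.0, 4.0, 8.0))
```

Here the pixel (1, 2) has p = 0.25 and q = 0.5, so the interpolated coefficients are r_x = 2.25 and c_x = 5, and the corrected intensity is 27.5.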
We claim: 

1. A method for tracking a first predetermined, two- 
dimensional portion of an image throughout a sequence of 
images, the method comprising the steps of: 

(a) selecting a reference frame; 

(b) selecting the predetermined, two-dimensional portion 
within the reference frame by choosing a reference 
polygon having at least three corners that defines the 
boundary of the first predetermined region; 

(c) fitting a reference mesh having corner nodes at the 
corners of the reference polygon and at least one inside 
node inside the reference polygon; 

(d) predicting the reference polygon in subsequent or 
previous image frames by independently tracking the 
corners of the reference polygon; 

(e1) dividing the reference polygon and the tracked 
polygon into a minimum number of triangles so that each 
triangle in the reference polygon respectively 
corresponds to a triangle in the tracked polygon; 

(e2) finding parameters of affine transformation between 
each corresponding pair of triangles; and 

(e3) mapping nodes in each triangle of the reference 
polygon into the respective triangle in the tracked 
polygon using the parameters of the corresponding 
affine transformation used for the triangle in which the 
node is located; 

(f) refining locations of the inside and corner nodes of the 
corresponding mesh for tracking local and global 
motion of the first predetermined portion, wherein the 
steps (c) to (f) are implemented in a hierarchy of spatial 
resolutions; 

(g) refining the location of boundary nodes on the 
reference mesh for tracking the local motion around the 
boundary of the first predetermined portion; 

(h) tracking illumination changes that occurred between 
the reference frame and a previous or subsequent 
frame; and 

(i) replacing the first predetermined portion with a second 
predetermined portion throughout a portion of the 
sequence of images so that the second predetermined 
portion undergoes the same global and local motion as 
the first predetermined portion; wherein the corner, 
inside and boundary nodes divide the reference mesh 
into either triangular or rectangular patches or a 
combination of both triangular and rectangular patches; 
wherein step (d) includes: (d1) selecting a motion 
model for the corner nodes; (d2) assigning a cost 
polygon to each corner node; and (d3) estimating 
parameters of the motion model for each cost polygon 
and (d4) mapping the corner nodes with the estimated 
motion parameters. 

2. The method as in claim 1, wherein step (d3) further 
includes defining a maximum range for estimating the 
parameters of the motion model. 

3. The method as in claim 2, further comprising the step 
of refining the location of boundary nodes on the tracked 
polygon. 

4. The method as in claim 3 further comprising the step of 
fitting a mesh to the second predetermined portion that 
corresponds in nodes and patches to the mesh in the 
reference polygon. 

5. The method as in claim 4 further comprising the steps 
of finding parameters of affine transformation between each 
corresponding pair of patches in the second predetermined 
portion and the tracked polygon in the previous and 
subsequent image frames. 

6. The method as in claim 5 further comprising the step of 
mapping pixels in each patch in the second predetermined 
portion into the corresponding patch in the tracked polygon 
using the parameters of the corresponding affine 
transformation. 

7. An article of manufacture comprising: 

a computer usable medium having computer readable 
program means embodied therein for causing tracking 
of a first predetermined, two-dimensional portion of an 
image throughout a sequence of images, the computer 
readable program code means in said article of 
manufacture comprising: 

(a) computer readable program means for causing the 
computer to effect selecting a reference frame; 

(b) computer readable program means for causing the 
computer to effect selecting the first predetermined, 
two-dimensional portion within the reference frame 
by choosing a reference polygon having at least three 
corners that defines the boundary of the first 
predetermined portion; 

(c) computer readable program means for causing the 
computer to effect fitting a reference mesh having 
corner nodes at the corners of the reference polygon 
and at least one inside the reference polygon; 



(d) computer readable program means for causing the 
computer to effect predicting the reference polygon 
in subsequent or previous image frames by indepen- 
dently tracking the corner of the reference polygon; 

(e) computer readable program means for dividing the 
reference polygon and the tracked polygon into a 
minimum number of triangles so that each triangle in 
the reference polygon respectively corresponds to a 
triangle in the tracked polygon; for finding param- 
eters of affine transformation between each corre- 
sponding pair of triangles; and for mapping nodes in 
each triangle of the reference polygon into the 
respective triangle in the tracked polygon using the 
parameters of the corresponding affine transforma- 
tion used for the triangle in which the node is 
located; 

(f) computer readable program means for causing the 
computer to effect refining locations of the inside and 
corner nodes of the corresponding mesh for tracking 
local and global motion of the first predetermined 
portion; 

(g) means for causing the computer to effect defining 
the location of boundary nodes on the reference 
mesh for tracking the local motion around the bound- 
ary of the predetermined portion; 

(h) means for causing the computer to effect tracking 
illumination changes that occurred between the ref- 
erence frame and a previous or subsequent frame; 

(i) means for causing said (c), (d), (e) and (f) computer 
readable program means to be implemented in a 
hierarchy of spatial resolutions; 

(j) means for causing the computer to effect replacing 
the first predetermined portion with a second prede- 
termined portion throughout a portion of the 
sequence of images so that the second predetermined 
portion undergoes the same global and local motion 
as the first predetermined portion, wherein the 
corner, inside and boundary nodes divide the refer- 
ence mesh into either triangular or rectangular 
patches or a combination of both triangular and 
rectangular patches; and 

(k) means for selecting a motion model for the corner 
nodes; for assigning a cost polygon to each corner 
node; for estimating parameters of the motion model 
for each cost polygon; and for mapping the corner 
nodes with the estimated motion parameters. 

8. The article of manufacture as in claim 7 further 
comprising computer readable program means for defining 
a maximum range for estimating the parameters of the 
motion model. 
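
The corner tracking of element (k), bounded by the maximum range of claim 8, might be sketched as follows. Several simplifications here are assumptions and are not taken from the claims: the motion model is purely translational, the cost polygon is replaced by a square window centered on the corner, and the matching criterion is the sum of absolute differences (SAD). The name track_corner and its parameters half_window and max_range are hypothetical.

```python
def track_corner(ref, cur, corner, half_window=2, max_range=3):
    """Estimate the translation of one corner node by exhaustive search.

    ref, cur    : 2-D lists of intensities (reference and current frame)
    corner      : (x, y) integer position of the corner in the reference frame
    half_window : half-size of the square window standing in for the cost polygon
    max_range   : largest displacement tested in each direction
    Returns ((x_new, y_new), (dx, dy)).
    """
    cx, cy = corner
    height, width = len(ref), len(ref[0])

    def inside(x, y):
        return 0 <= x < width and 0 <= y < height

    best = None  # (sad, dx, dy) of the best candidate so far
    for dy in range(-max_range, max_range + 1):
        for dx in range(-max_range, max_range + 1):
            sad = 0
            usable = True
            for j in range(-half_window, half_window + 1):
                for i in range(-half_window, half_window + 1):
                    rx, ry = cx + i, cy + j
                    # Skip candidates whose window leaves either frame.
                    if not (inside(rx, ry) and inside(rx + dx, ry + dy)):
                        usable = False
                        break
                    sad += abs(ref[ry][rx] - cur[ry + dy][rx + dx])
                if not usable:
                    break
            if usable and (best is None or sad < best[0]):
                best = (sad, dx, dy)
    if best is None:
        raise ValueError("cost window never fits inside both frames")
    _, dx, dy = best
    return (cx + dx, cy + dy), (dx, dy)
```

Limiting the search to max_range keeps the cost of the exhaustive search bounded and prevents a corner from jumping to a distant but similar-looking region.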

9. The article of manufacture as in claim 8 further 
comprising computer readable program means for refining 
the location of boundary nodes on the tracked polygon. 

10. The article of manufacture as in claim 9 further 
comprising computer readable program means for fitting a 
mesh to the second predetermined portion that corresponds 
in nodes and patches to the mesh in the reference polygon. 

11. The article of manufacture as in claim 10 further 
comprising computer readable program means for finding 
parameters of affine transformation between each corre- 
sponding pair of patches in the second predetermined por- 
tion and the tracked polygon in the previous and subsequent 
image frames. 

12. The article of manufacture as in claim 11 further 
comprising computer readable program means for mapping 
pixels in each patch in the second predetermined portion into 
the corresponding patch in the tracked polygon using the 
parameters of the corresponding affine transformation. 
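
The pixel mapping of claim 12 can be sketched for a single triangular patch. This is an illustrative assumption rather than the claimed implementation: it uses backward mapping, visiting each pixel of the tracked patch and sampling the replacement patch, so that no holes are left in the output. It uses barycentric coordinates, which an affine map between two triangles preserves, in place of explicit affine parameters. The names barycentric and warp_triangle are hypothetical, and nearest-neighbor sampling is chosen only for brevity.

```python
import math


def barycentric(p, tri):
    """Barycentric coordinates (l1, l2, l3) of point p with respect to tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    px, py = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if abs(det) < 1e-12:
        raise ValueError("degenerate triangle")
    l1 = ((y2 - y3) * (px - x3) + (x3 - x2) * (py - y3)) / det
    l2 = ((y3 - y1) * (px - x3) + (x1 - x3) * (py - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def warp_triangle(src_img, src_tri, dst_img, dst_tri):
    """Fill dst_tri in dst_img with pixels sampled from src_tri in src_img.

    Images are 2-D lists indexed as img[y][x]; dst_img is modified in place.
    """
    src_h, src_w = len(src_img), len(src_img[0])
    dst_h, dst_w = len(dst_img), len(dst_img[0])
    xs = [p[0] for p in dst_tri]
    ys = [p[1] for p in dst_tri]
    # Scan only the bounding box of the destination triangle.
    x_lo = max(int(math.floor(min(xs))), 0)
    x_hi = min(int(math.ceil(max(xs))), dst_w - 1)
    y_lo = max(int(math.floor(min(ys))), 0)
    y_hi = min(int(math.ceil(max(ys))), dst_h - 1)
    for y in range(y_lo, y_hi + 1):
        for x in range(x_lo, x_hi + 1):
            l1, l2, l3 = barycentric((x, y), dst_tri)
            if min(l1, l2, l3) < -1e-9:
                continue  # pixel lies outside the destination triangle
            # The same weights applied to the source vertices give the
            # affinely corresponding source position.
            sx = l1 * src_tri[0][0] + l2 * src_tri[1][0] + l3 * src_tri[2][0]
            sy = l1 * src_tri[0][1] + l2 * src_tri[1][1] + l3 * src_tri[2][1]
            ix = min(max(int(round(sx)), 0), src_w - 1)
            iy = min(max(int(round(sy)), 0), src_h - 1)
            dst_img[y][x] = src_img[iy][ix]
    return dst_img
```

Applying warp_triangle once per pair of corresponding patches covers the whole tracked polygon. Bilinear interpolation could replace the nearest-neighbor lookup where smoother results are needed.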

13. A method for tracking a first predetermined, two- 
dimensional portion of an image throughout a sequence of 
images, the method comprising the steps of: 

(a) selecting a reference frame; 

(b) selecting the predetermined, two-dimensional portion 
within the reference frame by choosing a reference 
polygon having at least three corners that defines the 
boundary of the first predetermined region; 

(c) fitting a reference mesh having corner nodes at the 
corners of the reference polygon and at least one inside 
node inside the reference polygon; 

(d) predicting the reference polygon in subsequent or 
previous image frames by independently tracking the 
corners of the reference polygon; 

(e) predicting a corresponding mesh in the subsequent or 
previous image frames by mapping the reference mesh 
into the tracked polygon using a plurality of different 
affine transformations; 

(f) refining locations of the inside and corner nodes of the 
corresponding mesh for tracking local and global 
motion of the first predetermined portion; wherein the 
steps (c) to (f) are implemented in a hierarchy of spatial 
resolutions; 



(g) refining the location of boundary nodes on the refer- 
ence mesh for tracking the local motion around the 
boundary of the first predetermined portion; 

(h) tracking illumination changes that occurred between 
the reference frame and a previous or subsequent 
frame; 

(i) replacing the first predetermined portion with a second 
predetermined portion throughout a portion of the 
sequence of images so that the second predetermined 
portion undergoes the same global and local motion as 
the first predetermined portion; wherein the corner, 
inside and boundary nodes divide the reference mesh 
into either triangular or rectangular patches or a com- 
bination of both triangular and rectangular patches; and 
wherein step (d) includes: (d1) selecting a motion 
model for the corner nodes; (d2) assigning a cost 
polygon to each corner node; and (d3) estimating 
parameters of the motion model for each cost polygon 
and (d4) mapping the corner nodes with the estimated 
motion parameters. 
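
The hierarchy of spatial resolutions recited for steps (c) to (f) can be illustrated with a coarse-to-fine sketch. The claims do not specify how the hierarchy is built, so the pyramid below (2x2 averaging with a factor of two between levels) is an assumption, as are the names downsample, build_pyramid and coarse_to_fine. The refine argument is a hypothetical callback standing in for the per-level mesh prediction and refinement steps.

```python
def downsample(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)]
            for y in range(h)]


def build_pyramid(img, levels):
    """Return [full resolution, half resolution, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def coarse_to_fine(ref_pyr, cur_pyr, nodes, refine):
    """Refine node positions from the coarsest level down to full resolution.

    nodes  : list of (x, y) node positions in full-resolution coordinates
    refine : callback (ref_level, cur_level, nodes) -> updated nodes, expressed
             in the coordinates of that level (hypothetical placeholder)
    """
    levels = len(ref_pyr)
    scale = 2 ** (levels - 1)
    # Express the initial positions in the coordinates of the coarsest level.
    est = [(x / scale, y / scale) for x, y in nodes]
    for level in range(levels - 1, -1, -1):
        est = refine(ref_pyr[level], cur_pyr[level], est)
        if level > 0:
            # Propagate the estimate to the next finer level.
            est = [(2 * x, 2 * y) for x, y in est]
    return est
```

A displacement found at a coarse level is doubled each time the estimate moves to a finer level, so a small search range per level can still capture large motion at full resolution.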